Learning graph based quantum query algorithms 
for finding constant-size subgraphs* 

Troy Lee^1, Frédéric Magniez^2, and Miklos Santha^{1,2}

1 Centre for Quantum Technologies, National University of Singapore, Singapore 117543
2 CNRS, LIAFA, Univ Paris Diderot, Sorbonne Paris Cité, F-75205 Paris, France



Abstract 

Let $H$ be a fixed $k$-vertex graph with $m$ edges and minimum degree $d > 0$. We use the
learning graph framework of Belovs to show that the bounded-error quantum query complexity
of determining if an $n$-vertex graph contains $H$ as a subgraph is $O(n^{2-2/k-t})$, where

$$t = \max\left\{ \frac{k^2 - 2(m+1)}{k(k+1)(m+1)},\ \frac{2k-d-3}{k(d+1)(m-d+2)} \right\}.$$

The previous best algorithm of Magniez et al. had complexity $\tilde{O}(n^{2-2/k})$.



1 Introduction 

Quantum query complexity. Quantum query complexity has been a very successful model
for studying the power of quantum computation. Important quantum algorithms, in particular
the search algorithm of Grover [Gro96] and the period finding subroutine of Shor's factoring
algorithm [Sho97], can be formulated in this model, yet it is still simple enough that one can often
prove tight lower bounds. This model is the quantum analog of deterministic and randomized
decision tree complexities; the resource measured is the number of queries to the input, and all other
operations are free.

For promise problems the quantum query complexity can be exponentially smaller than the
classical complexity, the Hidden Subgroup Problem [Sim97, EHK99] being the most striking
example. The situation is dramatically different for total functions, as Beals et al. [BBC+01] showed
that in this case the deterministic and the quantum query complexities are polynomially related.

One rich source of concrete problems is functions related to properties of graphs. Graph
problems were first studied in the quantum query model by Buhrman et al. [BCWZ99] and later
by Buhrman et al. [BDH+05], who looked at Triangle Finding together with Element Distinctness.
This was followed by the exhaustive work of Dürr et al. [DHHM06], who investigated many standard
graph problems including Connectivity, Strong Connectivity, Minimum Spanning Tree, and Single
Source Shortest Paths. All these approaches were based on clever uses of Grover's search algorithm.
The groundbreaking work of Ambainis [Amb07] using quantum walks for Element Distinctness
initiated the study of quantum walk based search algorithms. Magniez et al. [MSS07] used this
technique to design quantum query algorithms for finding constant size subgraphs, and recently
Childs and Kothari found a novel application of this framework to decide minor-closed graph
properties [CK11]. The results of [MSS07] imply that a $k$-vertex subgraph can be found with
$\tilde{O}(n^{2-2/k})$ queries, and moreover Triangle Finding is solvable with $\tilde{O}(n^{1.3})$ queries. Later, quantum
phase estimation techniques [MNRS11] were also applied to these problems, and in particular the
quantum query complexity of Triangle Finding was improved to $O(n^{1.3})$. The best lower bound
known for finding any constant sized subgraph is the trivial $\Omega(n)$.

The general adversary bound and learning graphs. Recently, there have been exciting
developments leading to a characterization of quantum query complexity in terms of a (relatively)
simple semidefinite program, the general adversary bound [Rei11, LMR+11]. Now to design quantum
algorithms it suffices to exhibit a solution to this semidefinite program. This plan turns out to
be quite difficult, as the minimization form of the general adversary bound (the easiest form to
upper bound) has exponentially many constraints. Even for simple functions it is difficult to directly
come up with a feasible solution, much less worry about finding a solution with good objective
value.

Belovs [Bel12b] recently introduced the model of learning graphs, which can be viewed as the
minimization form of the general adversary bound with additional structure imposed on the form
of the solution. This additional structure makes learning graphs much easier to reason about. In
particular, it ensures that the feasibility constraints are automatically satisfied, allowing one to
focus on coming up with a solution having a good objective value. Learning graphs are a very
promising model and have already been used to improve the complexity of Triangle Finding to
$O(n^{35/27})$ [Bel12b] and to give an $o(n^{3/4})$ algorithm for $k$-Element Distinctness [Bel12a], improving
the previous bound of $O(n^{k/(k+1)})$ [Amb07].

Our contribution. We give two learning graph based algorithms for the problem of
determining if a graph $G$ contains a fixed $k$-vertex subgraph $H$. Throughout the paper we will assume
that $k > 2$, as the problem of determining if $G$ contains an edge is equivalent to search. We denote
by $m$ the number of edges in $H$. The first algorithm we give has complexity $O(n^{2-2/k-t})$ where
$t = (k^2 - 2(m+1))/(k(k+1)(m+1)) > 0$. The second algorithm depends on the minimum degree
of a vertex in $H$. Say that the smallest degree of a vertex in $H$ is $d > 0$. This is without loss of
generality, as isolated vertices of $H$ can be removed and the theorem applied to the resulting graph
$H'$. The second algorithm has complexity $O(n^{2-2/k-t})$ where $t = (2k-d-3)/(k(d+1)(m-d+2)) > 0$.
Both algorithms thus improve on the previous best general subgraph finding algorithm of [MSS07],
which has complexity $\tilde{O}(n^{2-2/k})$. The first algorithm performs better, for example, on dense regular
graphs $H$, while the second algorithm performs better on the important case where $H$ is a triangle,
having complexity $O(n^{35/27})$, equal to that of the algorithm of Belovs [Bel12b].
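As a quick arithmetic check of the two exponents stated above, the following sketch evaluates them with exact rationals. The function names are hypothetical and chosen only for this illustration; the formulas are the two expressions for $t$ given in the text.

```python
from fractions import Fraction

def savings_exponents(k, m, d):
    """Return (t1, t2): the savings of the first and second algorithm
    for a k-vertex pattern with m edges and minimum degree d."""
    t1 = Fraction(k * k - 2 * (m + 1), k * (k + 1) * (m + 1))
    t2 = Fraction(2 * k - d - 3, k * (d + 1) * (m - d + 2))
    return t1, t2

def query_exponent(k, m, d):
    """Exponent e such that the bound reads O(n^e), e = 2 - 2/k - max(t1, t2)."""
    return 2 - Fraction(2, k) - max(savings_exponents(k, m, d))

# Triangle: k = 3 vertices, m = 3 edges, minimum degree d = 2.
triangle = query_exponent(3, 3, 2)
print(triangle)  # 35/27
```

For the triangle the second exponent gives $t = 1/27$, which recovers the $35/27$ bound quoted in the text.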

To explain these algorithms, we first give a high level description of the learning graph algorithm
in [Bel12b] for Triangle Finding, and its relation to the quantum walk algorithm given in [MSS07].
The learning graph algorithm in [Bel12b] for Triangle Finding is roughly a translation of the quantum
walk algorithm on the Johnson graph of [MSS07] into the learning graph framework, with one
additional twist. This is to maintain a database not of all edges present in $G$ amongst a subset
of $r$ vertices but rather a random sample of these edges. We will refer to this as sparsifying the
database. While in the quantum walk world this idea does not help, in the context of learning
graphs it leads to a better algorithm.

The quantum walk of [MSS07] works by looking for a subgraph $H' = H \setminus \{v\}$, where $v$ is a
vertex of minimal degree in H, and then (using the algorithm for element distinctness) finding the 
vertex v and the edges linking it to H' to form H. Our second learning graph algorithm translates 
this procedure into the learning graph framework, and again applies the trick of sparsifying the 
database. Our first algorithm is simpler and translates the quantum walk searching for H directly 
to the learning graph framework, again maintaining a sparsified database. 

The way we apply sparsification differs from how it is used in [Bel12b]. There every edge slot
is taken independently with some fixed probability, while in our case the sparse random graphs are 
chosen uniformly from a set of structured multipartite graphs whose edge pattern reflects that of 
the given subgraph. The probability space evolves during the algorithm, but at every stage the 
multipartite graphs have a very regular degree structure. This uniformity of the probability space 
renders the structure of the learning graph very transparent. 



Related contribution. Independently of our work, Zhu [Zhu11] also obtained Theorem 10.
His algorithm is also based on learning graphs, but differs from ours in working with randomly
sparsified cliques as in the algorithm of Belovs [Bel12b] for Triangle Finding, rather than graphs
with specified degrees as we do.



2 Preliminaries 

We denote by $[N]$ the set $\{1, 2, \dots, N\}$. The quantum query complexity of a function $f$, denoted
$Q(f)$, is the number of input queries needed to evaluate $f$ with error at most $1/3$. We refer the
reader to the survey [HS05] for precise definitions and background.

For a boolean function $f : \mathcal{D} \to \{0,1\}$ with $\mathcal{D} \subseteq \{0,1\}^N$, the general adversary bound [HLS07],
denoted $\mathrm{ADV}^{\pm}(f)$, can be defined as follows (this formulation was first given in [Rei09]):



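$$\begin{aligned}
\mathrm{ADV}^{\pm}(f) = \ & \text{minimize} && \max_{x \in \mathcal{D}} \sum_{i \in [N]} \|u_{x,i}\|^2 \\
& \text{subject to} && \sum_{i \in [N] : x_i \neq y_i} \langle u_{x,i} | u_{y,i} \rangle = 1 \quad \text{for all } f(x) \neq f(y).
\end{aligned} \tag{1}$$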

As the general adversary bound characterizes quantum query complexity [Rei11], quantum
algorithms can be developed (simply!) by devising solutions to this semidefinite program. This
turns out not to be so simple, however, as even coming up with feasible solutions to Equation (1)
is not easy because of the large number of strict constraints.

Learning graphs are a model of computation introduced by Belovs [Bel12b] that give rise to
solutions of Equation (1) and therefore quantum query algorithms. The model of learning graphs
is very useful as it ensures that the constraints are satisfied automatically, allowing one to focus on
coming up with a solution having a good objective value.

Definition 1 (Learning graph). A learning graph $\mathcal{G}$ is a 5-tuple $(\mathcal{V}, \mathcal{E}, w, \ell, \{p_y : y \in Y\})$ where
$(\mathcal{V}, \mathcal{E})$ is a rooted, weighted and directed acyclic graph, the weight function $w : \mathcal{E} \to \mathbb{R}$ maps
learning graph edges to positive real numbers, the length function $\ell : \mathcal{E} \to \mathbb{N}$ assigns each edge a
natural number, and $p_y : \mathcal{E} \to \mathbb{R}$ is a unit flow whose source is the root, for every $y \in Y$.

Definition 2 (Learning graph for a function). Let $f : \{0,1\}^N \to \{0,1\}$ be a function. A learning
graph $\mathcal{G}$ for $f$ is a 5-tuple $(\mathcal{V}, \mathcal{E}, S, w, \{p_y : y \in f^{-1}(1)\})$, where $S : \mathcal{V} \to 2^{[N]}$ maps $v \in \mathcal{V}$ to a
set $S(v) \subseteq [N]$ of variable indices, and $(\mathcal{V}, \mathcal{E}, w, \ell, \{p_y : y \in f^{-1}(1)\})$ is a learning graph for the
length function $\ell$ defined as $\ell((u,v)) = |S(v) \setminus S(u)|$ for each edge $(u,v)$. For the root $r \in \mathcal{V}$ we
have $S(r) = \emptyset$, and every learning graph edge $e = (u,v)$ satisfies $S(u) \subseteq S(v)$. For each input
$y \in f^{-1}(1)$, the set $S(v)$ contains a 1-certificate for $y$ on $f$, for every sink $v \in \mathcal{V}$ of $p_y$.

Note that it can be the case for an edge $(u,v)$ that $S(u) = S(v)$ and the length of the edge
is zero. In Belovs [Bel12b] what we define here is called a reduced learning graph, and a learning
graph is restricted to have all edges of length one.

Definition 3 (Flow preserving edge sets). A set of edges $E \subseteq \mathcal{E}$ is flow preserving if, in the
subgraph $G = (\mathcal{V}, E)$ induced by $E$, for every vertex $v \in \mathcal{V}$ which is not a source or a sink in $G$,
$\sum_{u \in \mathcal{V}} p_y((u,v)) = \sum_{w \in \mathcal{V}} p_y((v,w))$, for every $y$. For a flow preserving set of edges $E$ we let $p_y(E)$
denote the value of the flow $p_y$ over $E$, that is $p_y(E) = \sum_{s \,:\, \text{source in } G} \sum_{v \in \mathcal{V}} p_y((s,v))$.

Observe that $p_y / p_y(E)$ is a unit flow over $E$ whenever $p_y(E) \neq 0$, and that $p_y(\mathcal{E}) = 1$ for every
$y$. The complexity of a learning graph is defined as follows.

Definition 4 (Learning graph complexity). Let $\mathcal{G}$ be a learning graph, and let $E \subseteq \mathcal{E}$ be a set of flow
preserving learning graph edges. The negative complexity of $E$ is $C_0(E) = \sum_{e \in E} \ell(e) w(e)$. The
positive complexity of $E$ under the flow $p_y$ is

$$C_{1,y}(E) = \sum_{e \in E} \frac{\ell(e)}{w(e)} \left( \frac{p_y(e)}{p_y(E)} \right)^2$$

if $p_y(E) > 0$, and $0$ otherwise.

The positive complexity of $E$ is $C_1(E) = \max_{y \in Y} C_{1,y}(E)$. The complexity of $E$ is $C(E) =
\sqrt{C_0(E) C_1(E)}$, and the learning graph complexity of $\mathcal{G}$ is $C(\mathcal{G}) = C(\mathcal{E})$. The learning graph
complexity of a function $f$, denoted $\mathcal{LG}(f)$, is the minimum learning graph complexity of a learning
graph for $f$.
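To make Definition 4 concrete, here is a minimal sketch that evaluates the negative and positive complexity on a toy learning graph for search over $N$ bits: the root has one edge per variable, each of length 1 and weight 1, and the flow for an input with a single 1 at position $i$ puts unit flow on edge $i$. The data layout and function names are assumptions made for this illustration, and the code only handles an edge set between two consecutive levels, where $p_y(E)$ is the sum of the flows on its edges.

```python
from math import sqrt

def negative_complexity(edges):
    # C_0(E) = sum over edges of length(e) * weight(e)
    return sum(length * weight for (length, weight) in edges.values())

def positive_complexity(edges, flows):
    # C_1(E) = max over y of sum_e length(e)/weight(e) * (p_y(e)/p_y(E))^2
    worst = 0.0
    for flow in flows:
        total = sum(flow.values())  # p_y(E) for a single-level edge set
        if total > 0:
            cost = sum(edges[e][0] / edges[e][1] * (p / total) ** 2
                       for e, p in flow.items())
            worst = max(worst, cost)
    return worst

def complexity(edges, flows):
    return sqrt(negative_complexity(edges) * positive_complexity(edges, flows))

N = 16
edges = {i: (1, 1.0) for i in range(N)}   # edge i loads variable i: (length, weight)
flows = [{i: 1.0} for i in range(N)]      # one flow per input with a single 1
search_cost = complexity(edges, flows)    # sqrt(N * 1)
```

Here the negative complexity is $N$ and the positive complexity is $1$, so the toy learning graph has complexity $\sqrt{N}$.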

The usefulness of learning graphs for quantum query complexity is given by the following
theorem.

Theorem 1 (Belovs). $Q(f) = O(\mathcal{LG}(f))$.

We study functions $f : \{0,1\}^{\binom{n}{2}} \to \{0,1\}$ whose input is an undirected $n$-vertex graph. We
will refer to the vertices and edges of the learning graph as L-vertices and L-edges so as not to
cause confusion with the vertices/edges of the input graph. Furthermore, we will only consider
learning graphs where every L-vertex is labeled by a $k$-partite undirected graph on $[n]$, where $k$ is
some fixed positive integer. Different L-vertices will have different labels, and we will identify an
L-vertex with its label.

3 Analysis of learning graphs 

We first review some tools developed by Belovs to analyze the complexity of learning graphs and
then develop some new ones useful for the learning graphs we construct. We fix for this section a
learning graph $\mathcal{G} = (\mathcal{V}, \mathcal{E}, w, \ell, \{p_y\})$. By level $d$ of $\mathcal{G}$ we refer to the set of vertices at distance $d$ from
the root. A stage is the set of edges of $\mathcal{G}$ between level $i$ and level $j$, for some $i < j$. For a subset
$V \subseteq \mathcal{V}$ of the L-vertices let $V^+ = \{(v,w) \in \mathcal{E} : v \in V\}$ and similarly let $V^- = \{(u,v) \in \mathcal{E} : v \in V\}$.
For a vertex $v$ we will write $v^+$ instead of $\{v\}^+$, and similarly $v^-$ instead of $\{v\}^-$. Let $E$ be
a stage of $\mathcal{G}$ and let $V$ be some subset of the L-vertices at the beginning of the stage. We set
$E_V^{\rightarrow} = \{(v,w) \in E : v = u \text{ or } v \text{ is a descendant of } u, \text{ for some } u \in V\}$. For a vertex $v$ we will write $E_v^{\rightarrow}$
instead of $E_{\{v\}}^{\rightarrow}$.

Given a learning graph $\mathcal{G}$, the easiest way to obtain another learning graph is to modify the
weight function of $\mathcal{G}$. We will often use this reweighting scheme to obtain learning graphs with
better complexity or complexity that is more convenient to analyze. When $\mathcal{G}$ is understood from
the context, and when $w'$ is the new weight function, for any subset $E \subseteq \mathcal{E}$ of the L-edges, we
denote the complexity of $E$ with respect to $w'$ by $C^{w'}(E)$.

An illustration of the reweighting method is the following lemma of Belovs which states that 
we can upper bound the complexity of a learning graph by partitioning it into a constant number 
of stages and summing the complexities of the stages. 

Lemma 2 (Belovs). If $\mathcal{E}$ can be partitioned into a constant number $k$ of stages $E_1, \dots, E_k$, then
there exists a weight function $w'$ such that $C^{w'}(\mathcal{G}) = O(C(E_1) + \dots + C(E_k))$.

Now we will focus on evaluating the complexity of a stage. Belovs has given a general theorem 
to simplify the calculation of the complexity of a stage for flows with a high degree of symmetry 
(Theorem 6 in [Bel12b]). Our flows will possess this symmetry but rather than apply Belovs'
theorem, we develop one from scratch that takes further advantage of the regular structure of our 
learning graphs. 

Definition 5 (Consistent flows). Let $E$ be a stage of $\mathcal{G}$ and let $V_1, \dots, V_s$ be a partition of the
L-vertices at the beginning of the stage. We say that $\{p_y\}$ is consistent with $E_{V_1}^{\rightarrow}, \dots, E_{V_s}^{\rightarrow}$ if $p_y(E_{V_i}^{\rightarrow})$
is independent of $y$ for each $i$.

Lemma 3. Let $E$ be a stage of $\mathcal{G}$ and let $V_1, \dots, V_s$ be a partition of the L-vertices at the beginning
of the stage. Set $E_i = E_{V_i}^{\rightarrow}$, and suppose that $\{p_y\}$ is consistent with $E_1, \dots, E_s$. Then there is a
new weight function $w'$ for $\mathcal{G}$ such that

$$C^{w'}(E) \le \max_i C(E_i).$$

Proof. Since by hypothesis $p_y(E_i)$ is independent of $y$, denote it by $\alpha_i$. We assume that $\alpha_i > 0$ for
each $i$; if $\alpha_i = 0$ then $p_y((u,v)) = 0$ for every $y$ and $(u,v) \in E_i$, and these edges can be deleted from
the graph without affecting anything. For $e \in E_i$, we define the new weight $w'(e) = \alpha_i C_1(E_i) w(e)$.
Let us analyze the complexity of $E$ under this weighting.

To evaluate the positive complexity observe that $p_y(E) = 1$ for every $y$, since $E$ is a stage, and
thus $\sum_i \alpha_i = 1$. Therefore

$$C_1^{w'}(E) = \max_y \sum_i \sum_{e \in E_i} \frac{\ell(e) p_y(e)^2}{w'(e)} \le \sum_i \frac{\alpha_i}{C_1(E_i)} \max_y \sum_{e \in E_i} \frac{\ell(e) p_y(e)^2}{w(e) \alpha_i^2} = \sum_i \alpha_i = 1.$$

The negative complexity can be bounded by

$$C_0^{w'}(E) = \sum_i \sum_{e \in E_i} \ell(e) w'(e) = \sum_i \alpha_i C_1(E_i) \sum_{e \in E_i} \ell(e) w(e) = \sum_i \alpha_i C_0(E_i) C_1(E_i) \le \max_i C(E_i)^2.$$

□
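The reweighting $w'(e) = \alpha_i C_1(E_i) w(e)$ used in the proof of Lemma 3 can be checked numerically on a small example. The sketch below uses a hypothetical single-level stage split into two groups of edges, with two flows that both send $1/4$ of the flow through the first group and $3/4$ through the second; the edge names, lengths and flow values are made up for illustration.

```python
def c0(edges, weight):
    # negative complexity: sum of length * weight
    return sum(length * weight[e] for e, length in edges.items())

def c1(edges, weight, flows):
    # positive complexity of a single-level edge set, maximized over the flows
    best = 0.0
    for p in flows:
        total = sum(p.get(e, 0.0) for e in edges)
        if total > 0:
            cost = sum(length / weight[e] * (p.get(e, 0.0) / total) ** 2
                       for e, length in edges.items())
            best = max(best, cost)
    return best

groups = [{"a0": 1, "a1": 1}, {"b0": 2, "b1": 2, "b2": 2}]   # edge -> length
weight = {e: 1.0 for g in groups for e in g}
flows = [{"a0": 0.25, "b0": 0.75},
         {"a1": 0.25, "b1": 0.375, "b2": 0.375}]

new_weight = {}
for g in groups:
    alpha = sum(flows[0].get(e, 0.0) for e in g)   # the same for every flow (consistency)
    for e in g:
        new_weight[e] = alpha * c1(g, weight, flows) * weight[e]

stage = {e: length for g in groups for e, length in g.items()}
pos = c1(stage, new_weight, flows)                 # at most 1
neg = c0(stage, new_weight)                        # at most max_i C(E_i)^2
bound = max(c0(g, weight) * c1(g, weight, flows) for g in groups)
```

In this example the reweighted positive complexity is exactly 1 and the negative complexity is 9.5, below the bound $\max_i C(E_i)^2 = 12$.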
At a high level, we will analyze the complexity of a stage $E$ as follows. First, we partition the
set of vertices $V$ into equivalence classes $[u] = \{\sigma(u) : \sigma \in S_n\}$ for some appropriate action of $S_n$
that we will define later, and use symmetry to argue that the flow is consistent with $\{E_{[u]}^{\rightarrow}\}$. Thus
by Lemma 3 it is enough to focus on the maximum complexity of $E_{[u]}^{\rightarrow}$. Within $E_{[u]}^{\rightarrow}$, our flows will
be of a particularly simple form. In particular, incoming flow will be uniformly distributed over a
subset of $[u]$ of fixed size independent of $y$. The next two lemmas evaluate the complexity of $E_{[u]}^{\rightarrow}$
in this situation.

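Lemma 4. Let $E$ be a stage of $\mathcal{G}$ and let $V$ be some subset of the L-vertices at the beginning of
the stage. For each $y$ let $W_y \subseteq V$ be the set of vertices in $V$ which receive positive flow under $p_y$.
Suppose that for every $y$ the following is true:

1. $E_u^{\rightarrow} \cap E_v^{\rightarrow} = \emptyset$ for $u \neq v \in V$,

2. $|W_y|$ is independent of $y$,

3. for all $v \in W_y$ we have $p_y(E_v^{\rightarrow}) = p_y(E_V^{\rightarrow}) / |W_y|$.

Then

$$C(E_V^{\rightarrow}) \le \max_{v \in V} \sqrt{C_0(E_v^{\rightarrow})} \ \max_{v \in V} \sqrt{C_1(E_v^{\rightarrow})} \ \sqrt{\frac{|V|}{|W_y|}}.$$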



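Proof. The negative complexity can easily be upper bounded by

$$C_0(E_V^{\rightarrow}) = \sum_{v \in V} C_0(E_v^{\rightarrow}) \le |V| \max_{v \in V} C_0(E_v^{\rightarrow}).$$

For the positive complexity we have

$$\begin{aligned}
C_1(E_V^{\rightarrow}) &= \max_y \sum_{v \in W_y} \sum_{e \in E_v^{\rightarrow}} \frac{\ell(e) p_y(e)^2}{w(e) \, p_y(E_V^{\rightarrow})^2} \\
&\le \max_y \sum_{v \in W_y} \frac{1}{|W_y|^2} \sum_{e \in E_v^{\rightarrow}} \frac{\ell(e) p_y(e)^2}{w(e) \, p_y(E_v^{\rightarrow})^2} \\
&\le \frac{\max_{v \in V} C_1(E_v^{\rightarrow})}{|W_y|}.
\end{aligned}$$

□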



Observe that when $E$ is a stage between two consecutive levels, that is between level $i$ and $i+1$
for some $i$, and $V$ is a subset of the vertices at the beginning of the stage, then $E_V^{\rightarrow} = V^+$. We will
use Lemma 3 in conjunction with Lemma 4 first in this context.



Lemma 5. Let $E$ be a stage of $\mathcal{G}$ between two consecutive levels. Let $V$ be the set of L-vertices
at the beginning of the stage and suppose that each $v \in V$ has outdegree $d$ and all L-edges $e$ of the
stage satisfy $w(e) = 1$ and $\ell(e) \le \ell$. Let $V_1, \dots, V_s$ be a partition of $V$, and for all $y$ and $i$, let
$W_{y,i} \subseteq V_i$ be the set of vertices in $V_i$ which receive positive flow under $p_y$. Suppose that

1. the flows $\{p_y\}$ are consistent with $\{V_i^+\}$,

2. $|W_{y,i}|$ is independent of $y$ for every $i$, and for all $v \in W_{y,i}$ we have $p_y(v^+) = p_y(V_i^+) / |W_{y,i}|$,

3. there is a $g$ such that for each vertex $v \in W_{y,i}$ the flow is directed uniformly to $g$ of the $d$
many neighbors.

Then there is a new weight function $w'$ such that

$$C^{w'}(E) \le \max_i \ \ell \sqrt{\frac{d}{g} \cdot \frac{|V_i|}{|W_{y,i}|}}. \tag{2}$$



Proof. By hypothesis (1) we are in the realm of Lemma 3 and therefore $C^{w'}(E) \le \max_i C(V_i^+)$. To
evaluate $C(V_i^+)$, we can apply Lemma 4 according to hypothesis (2). The statement of the lemma
then follows, since for every $v \in V$ we have $C_0(v^+) \le \ell d$, and $C_1(v^+) \le \ell / g$ by hypothesis (3). □

This lemma will be the main tool we use to analyze the complexity of stages. Note that the
complexity in Equation (2) can be decomposed into three parts: the length $\ell$, the degree ratio $d/g$,
and the maximum vertex ratio $\max_i |V_i| / |W_{y,i}|$. This terminology will be very helpful to evaluate
the complexity of stages.
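The three-part decomposition of Equation (2) can be sanity-checked against a direct evaluation of Definition 4. The sketch below is an illustration under simplifying assumptions: a single class $V_1 = V$, every edge of weight 1 and length exactly $\ell$, and flow spread evenly over $g$ edges at each of the used vertices. The function names and the numbers are hypothetical.

```python
from math import sqrt, isclose

def stage_bound(length, degree_ratio, max_vertex_ratio):
    # right-hand side of Equation (2): length * sqrt(degree ratio * vertex ratio)
    return length * sqrt(degree_ratio * max_vertex_ratio)

def direct_complexity(length, d, g, num_vertices, num_used):
    # C_0: every one of the num_vertices * d edges has weight 1 and the given length
    c0 = num_vertices * d * length
    # C_1: flow 1/(num_used * g) on each of the num_used * g edges carrying flow
    c1 = num_used * g * length * (1.0 / (num_used * g)) ** 2
    return sqrt(c0 * c1)

bound = stage_bound(3, 8 / 2, 10 / 5)
direct = direct_complexity(3, 8, 2, 10, 5)
```

In this fully uniform case the two quantities coincide, which is why only the length, the degree ratio and the vertex ratio need to be tracked when analyzing a stage.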

We will use symmetry to decompose our flows as convex combinations of uniform flows over
disjoint sets of edges. Recall that each L-vertex $u$ is labeled by a $k$-partite graph on $[n]$, say with
color classes $A_1, \dots, A_k$, and that we identify an L-vertex with its label. For $\sigma \in S_n$ we define the
action of $\sigma$ on $u$ as $\sigma(u) = v$, where $v$ is a $k$-partite graph with color classes $\sigma(A_1), \dots, \sigma(A_k)$ and
edges $\{\sigma(i), \sigma(j)\}$ for every edge $\{i,j\}$ in $u$.

Define an equivalence class $[u]$ of L-vertices by $[u] = \{\sigma(u) : \sigma \in S_n\}$. We say that $S_n$ acts
transitively on flows $\{p_y\}$ if for every $y, y'$ there is a $\tau \in S_n$ such that $p_y((u,v)) = p_{y'}((\tau(u), \tau(v)))$
for all L-edges $(u,v)$.
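The action of $S_n$ on labels and the resulting equivalence classes can be illustrated by brute force for small $n$. In this sketch a label is represented, as an assumption of the illustration, by a tuple of color classes together with a set of edges; `act` and `orbit` are hypothetical helper names.

```python
from itertools import permutations

def act(sigma, label):
    """Apply the permutation sigma (a dict on [n]) to a label (classes, edges)."""
    classes, edges = label
    new_classes = tuple(frozenset(sigma[x] for x in c) for c in classes)
    new_edges = frozenset(frozenset(sigma[x] for x in e) for e in edges)
    return new_classes, new_edges

def orbit(label, n):
    """Equivalence class [u]: all images of the label under permutations of [n]."""
    points = list(range(1, n + 1))
    return {act(dict(zip(points, image)), label) for image in permutations(points)}

# A bipartite label on [4]: color classes {1} and {2}, joined by the edge {1, 2}.
label = ((frozenset({1}), frozenset({2})), frozenset({frozenset({1, 2})}))
size = len(orbit(label, 4))
```

Here the class $[u]$ has 12 elements, one for each ordered pair of distinct vertices of $[4]$.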

As shown in the next lemma, if $S_n$ acts transitively on a set of flows $\{p_y\}$ then they are consistent
with $[v]^+$, where $v$ is a vertex at the beginning of a stage between consecutive levels. This will set
us up to satisfy hypothesis (1) of Lemma 5.



Lemma 6. Consider a learning graph $\mathcal{G}$ and a set of flows $\{p_y\}$ such that $S_n$ acts transitively
on $\{p_y\}$. Let $V$ be the set of L-vertices of $\mathcal{G}$ at some given level. Then $\{p_y\}$ is consistent with
$\{[u]^+ : u \in V\}$, and, similarly, $\{p_y\}$ is consistent with $\{[u]^- : u \in V\}$.

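Proof. Let $p_y, p_{y'}$ be two flows and $\tau \in S_n$ such that $p_y((u,v)) = p_{y'}((\tau(u), \tau(v)))$ for all L-edges
$(u,v)$. Then

$$\begin{aligned}
p_y([u]^+) &= \sum_{v \in [u]} \ \sum_{w : (v,w) \in \mathcal{E}} p_y((v,w)) \\
&= \sum_{\tau^{-1}(v) \in [u]} \ \sum_{\tau^{-1}(w) : (\tau^{-1}(v), \tau^{-1}(w)) \in \mathcal{E}} p_{y'}((v,w)) \\
&= \sum_{v \in [u]} \ \sum_{w : (v,w) \in \mathcal{E}} p_{y'}((v,w)) = p_{y'}([u]^+).
\end{aligned}$$

The statement $p_y([u]^-) = p_{y'}([u]^-)$ follows in exactly the same way. □
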
The next lemma gives a sufficient condition for hypothesis (2) of Lemma 5 to be satisfied. The
partition of vertices in Lemma 5 will be taken according to the equivalence classes [u] . Note that 
unlike the previous lemmas in this section that only consider a stage of a learning graph, this lemma 
speaks about the learning graph in its entirety. 

Lemma 7. Consider a learning graph and a set of flows $\{p_y\}$ such that $S_n$ acts transitively on
$\{p_y\}$. Suppose that for every L-vertex $u$ and flow $p_y$ such that $p_y(u^-) > 0$,

1. the flow from $u$ is uniformly directed to $g^+([u])$ many neighbors,

2. for every L-vertex $w$, the number of incoming edges with positive flow from $[w]$ to $u$ is $g^-([w], [u])$.

Then for every L-vertex $u$ the flow entering $[u]$ is uniformly distributed over $W_{y,[u]} \subseteq [u]$ where
$|W_{y,[u]}|$ is independent of $y$.

Proof. We first use hypotheses (1), (2) of Lemma 7 to show that for every flow $p_y$ and for every
L-vertex $u$, the incoming flow $p_y(u^-)$ to $u$ is either $0$ or $\alpha_y([u]) > 0$, that is, it depends only on
the equivalence class of $u$. We then use transitivity and hypothesis (2) of Lemma 7 to reach the
conclusion of the lemma.

Let $V_t$ be the set of vertices at level $t$ and fix a flow $p_y$. The proof is then by induction on the
level $t$ on a stronger statement: for every $\sigma, \sigma' \in S_n$ and L-vertices $u \in V_t$ and $v, v' \in V_{t+1}$,

$$[\,p_y((u,v)) > 0 \text{ and } p_y((\sigma(u), v')) > 0\,] \implies p_y((u,v)) = p_y((\sigma(u), v')), \tag{3}$$

$$[\,p_y(\sigma(u)^-) > 0 \text{ and } p_y(\sigma'(u)^-) > 0\,] \implies p_y(\sigma(u)^-) = p_y(\sigma'(u)^-). \tag{4}$$

At level $t = 0$, the statement is correct since the root is unique, has incoming flow $1$, and
outgoing edges with flow $0$ or $1/g^+(\text{root})$.

Assume the statements hold up to and including level $t$. Hypothesis (1) implies that when
$p_y((u,v)) > 0$ for $u \in V_t$, it satisfies $p_y((u,v)) = p_y(u^-)/g^+([u])$, and similarly $p_y((\tau(u), v')) =
p_y(\tau(u)^-)/g^+([u])$. Therefore, Equation (4) at level $t$ implies Equation (3) at level $t+1$.

We now turn to Equation (4) at level $t+1$. Fix $v \in V_{t+1}$ and $\sigma, \sigma' \in S_n$ such that $\sigma(v)$ and $\sigma'(v)$
have positive incoming flows. Then

$$p_y(\sigma(v)^-) = \sum_{u \in V_t} p_y((u, \sigma(v))) \quad \text{and} \quad p_y(\sigma'(v)^-) = \sum_{u \in V_t} p_y((u, \sigma'(v))).$$

We will show that $p_y(\sigma(v)^-) = p_y(\sigma'(v)^-)$ by proving the following equality for every $u$:

$$\sum_{\tau(u) \in [u]} p_y((\tau(u), \sigma(v))) = \sum_{\tau(u) \in [u]} p_y((\tau(u), \sigma'(v))).$$

By Equation (3) at level $t$, all nonzero terms in the respective sums are identical. By Hypothesis (2)
the number of nonzero terms is $g^-([u], [v])$ in both sums. Therefore the two sums are identical.

We have now concluded that the incoming flow to an L-vertex $u$ is either $0$ or $\alpha_y([u]) > 0$. This
implies that the flow entering $[u]$ is uniformly distributed over some set $W_{y,[u]} \subseteq [u]$. We now show
that the size of this set is independent of $y$.

If the flow is transitive then $\alpha_y([u])$ is independent of $y$ and furthermore, by the second statement
of Lemma 6 applied to the level of $u$, $\sum_{v \in [u]} p_y(v^-) = \sum_{v \in [u]} p_{y'}(v^-)$. Thus the number of terms
in each sum must be the same and $|W_{y,[u]}|$ is independent of $y$. □

4 Algorithms



We first discuss some basic assumptions about the subgraph $H$. Say that $H$ has $k$ vertices and
minimum degree $d$. First, we assume that $d \ge 1$, that is, $H$ has no isolated vertices. Recall
that we are testing if $G$ contains $H$ as a subgraph, not as an induced subgraph. Thus if $H'$ is $H$
with isolated vertices removed, then $G$ will contain $H$ if and only if $G$ contains $H'$ and $n \ge k$.
Furthermore, the algorithms we give in this section behave monotonically with $k$, and so will have
smaller complexity on the graph $H'$. Additionally, we assume that $k \ge 3$, as if $k = 2$ and $d = 1$
then $H$ is simply an edge, and in that case the complexity is known to be $O(n)$ as it is equivalent to
search on $O(n^2)$ items.

Thus let $H$ be a graph on vertex set $\{1, 2, \dots, k\}$, with $k \ge 3$ vertices. We present two algorithms
in this section for determining if a graph $G$ contains $H$. Following Belovs, we say that a stage loads
an edge $\{a, b\}$ if for all L-edges $(u,v)$ with flow in the stage, we have $\{a, b\} \in S(v) \setminus S(u)$. Both
algorithms will use a subroutine, given in Section 4.1, to load an induced subgraph of $H$. For
some integer $1 \le u \le k$, let $H_{[1,u]}$ be the subgraph of $H$ induced by vertices $1, 2, \dots, u$. The first
algorithm, given in Section 4.2, will take $u = k$ and load $H$ directly; the second algorithm, given
in Section 4.3, will first load $H_{[1,k-1]}$, and then search for the missing vertex that completes $H$.



4.1 Loading a subgraph of H 

Fix $1 \le u \le k$ and let $e_1, \dots, e_m$ be the edges of $H_{[1,u]}$ enumerated in some fixed order. We
assume that $m \ge 1$. For any positive input graph $G$, that is a graph $G$ which contains a copy of $H$,
we fix $k$ vertices $a_1, a_2, \dots, a_k$ such that $\{a_i, a_j\}$ is an edge of $G$ whenever $\{i,j\}$ is an edge of $H$.

We define a bit of terminology that will be useful. For two sets $Y_1, Y_2 \subseteq [n]$, we say that a
bipartite graph between $Y_1$ and $Y_2$ is of type $(\{(n_1, d_1), \dots, (n_j, d_j)\}, \{(m_1, g_1), \dots, (m_\ell, g_\ell)\})$ if $Y_1$
has $n_i$ vertices of degree $d_i$ for $i = 1, \dots, j$, and $Y_2$ has $m_i$ vertices of degree $g_i$ for $i = 1, \dots, \ell$, and
this is a complete listing of vertices in the graph, i.e. $|Y_1| = \sum_{i=1}^{j} n_i$ and $|Y_2| = \sum_{i=1}^{\ell} m_i$.
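The notion of type can be made concrete with a short sketch that reads off the type of a given bipartite graph. The function name and the input format (two disjoint vertex sets and a list of edges, each written with its $Y_1$ endpoint first) are assumptions of this illustration.

```python
from collections import Counter

def bipartite_type(left, right, edges):
    """Return the type of a bipartite graph as a pair of sets of (count, degree) pairs."""
    degree = {v: 0 for v in list(left) + list(right)}
    for a, b in edges:            # a lies in `left`, b lies in `right`
        degree[a] += 1
        degree[b] += 1

    def summarize(side):
        counts = Counter(degree[v] for v in side)
        return {(count, deg) for deg, count in counts.items()}

    return summarize(left), summarize(right)

# Y_1 = {1, 2, 3}, Y_2 = {4, 5, 6}: two vertices of degree 2 and one of degree 1 on each side.
example = bipartite_type({1, 2, 3}, {4, 5, 6},
                         [(1, 4), (1, 5), (2, 4), (2, 5), (3, 6)])
```

The example graph is of type $(\{(2,2),(1,1)\}, \{(2,2),(1,1)\})$.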

Vertices of our learning graph will be labeled by a $u$-partite graph $Q$ on disjoint sets
$X_1, \dots, X_u \subseteq [n]$. The global structure of $Q$ will mimic the edge pattern of $H_{[1,u]}$. Namely,
for each edge $e_t = \{i,j\}$ of $H_{[1,u]}$, there will be a bipartite graph $Q_t$ between $X_i$ and $X_j$ with a
specified degree sequence. There are no edges between $X_i$ and $X_j$ if $\{i,j\}$ is not an edge of $H_{[1,u]}$.
The mapping $S : \mathcal{V} \to 2^{\binom{[n]}{2}}$ from learning graph vertices to query indices returns the union of the
edges of $Q_t$ for $t = 1, \dots, m$.

We now describe the stages of our first learning graph. Let $V_t$ denote the L-vertices at the
beginning of stage $t$ (and so the end of stage $t-1$ for $t > 0$). The L-edges between $V_t$ and $V_{t+1}$
are defined in the obvious way: there is an L-edge between $v_t \in V_t$ and $v_{t+1} \in V_{t+1}$ if the graph
labeling $v_t$ is a subgraph of the graph labeling $v_{t+1}$. We initially set the weight of all L-edges to be
one, though some edges will be reweighted in the complexity analysis using Lemma 5. The root of
the learning graph is labeled by the empty graph.

The algorithm depends on two parameters $r, s$ which will be optimized later. The parameter
$r \in [n]$ will control the number of vertices, and $s \in [0,1]$ the edge density, of graphs labeling the
L-vertices.



Learning graph $\mathcal{G}_1$:

Stage 0: Setup (Figure 1). $V_1$ consists of all L-vertices labeled by a $u$-partite graph $Q$ with
color classes $A_1, \dots, A_u \subseteq [n]$, each of size $r-1$. The edges will be the union of the edges in
bipartite graphs $Q_1, \dots, Q_m$, where if $e_\ell = \{i,j\}$ is an edge of $H_{[1,u]}$, then $Q_\ell$ is a bipartite graph of
type $(\{(r-1-rs, rs), (rs, rs-1)\}, \{(r-1-rs, rs), (rs, rs-1)\})$ between $A_i$ and $A_j$. The number
of edges added in this stage is $O(sr^2)$. Flow is uniform from the root of the learning graph, whose
label is the empty graph, to all L-vertices such that $a_1, \dots, a_k \notin A_i$ for $i = 1, \dots, u$.

Stage $t$ for $t = 1, \dots, u$: Load $a_t$ (Figures 2 and 3). $V_{t+1}$ consists of all L-vertices labeled by
a $u$-partite graph $Q$ with color classes $B_1, \dots, B_t, A_{t+1}, \dots, A_u$, where $|B_i| = r$ and $|A_i| = r-1$.
The edges of $Q$ are the union of edges of bipartite graphs $Q_1, \dots, Q_m$, where if $e_\ell = \{i,j\}$ then the
type of $Q_\ell$ is given by the following cases:

• If $t < i < j$, then $Q_\ell$ is of type $(\{(r-1-rs, rs), (rs, rs-1)\}, \{(r-1-rs, rs), (rs, rs-1)\})$
between $A_i$ and $A_j$.

• If $i \le t < j$, then $Q_\ell$ is of type $(\{(r-rs, rs), (rs, rs-1)\}, \{(r-1, rs)\})$ between $B_i$ and $A_j$.

• If $i < j \le t$, then $Q_\ell$ is of type $(\{(r, rs)\}, \{(r, rs)\})$ between $B_i$ and $B_j$.

The number of edges added at stage $t$ is $O(rs)$. The flow is directed uniformly on those L-edges
where the element added to $A_t$ is $a_t$ and none of the edges $\{a_i, a_j\}$ are present.

Stage $u+1$: Hiding (Figure 4). Now we are ready to start loading edges $\{a_i, a_j\}$. If we simply
loaded the edge $\{a_i, a_j\}$ now, however, it would be uniquely identified by the degrees of $a_i, a_j$, since
only these vertices would have degree $rs+1$. This means that, for example, at the last stage of the
learning graph the vertex ratio would be $\Omega(n^{k-1})$, no matter what $r$ is. Thus in this stage we first
do a "hiding" step, adding edges so that half of the vertices in every set have degree $rs+1$.

Formally, $V_{u+2}$ consists of all L-vertices labeled by a $u$-partite graph $Q$ with color classes
$B_1, \dots, B_u$, where $|B_i| = r$. The edges of $Q$ are the union of edges of bipartite graphs $Q_1, \dots, Q_m$,
where if $e_\ell = \{i,j\}$ then $Q_\ell$ is of type $(\{(r/2, rs), (r/2, rs+1)\}, \{(r/2, rs), (r/2, rs+1)\})$ between
$B_i$ and $B_j$. The number of edges added in this stage is $O(r)$. The flow is directed uniformly to
those L-vertices where for every $e_\ell = \{i,j\}$, both $a_i$ and $a_j$ have degree $rs$ in $Q_\ell$.

Stage $u+t+1$ for $t = 1, \dots, m$: Load $\{a_i, a_j\}$ if $e_t = \{i,j\}$ (Figure 5). Take an L-vertex at
the beginning of stage $u+t+1$ whose edges are the union of bipartite graphs $Q_1, \dots, Q_m$. In stage
$u+t+1$ only $Q_t$ will be modified, by adding a single edge $\{b_i, b_j\}$ where $b_i \in B_i$ and $b_j \in B_j$ have
degree $rs$ in $Q_t$. The flow is directed uniformly along those L-edges where $b_i = a_i$ and $b_j = a_j$.

Thus at the end of stage $u+m+1$, the L-vertices are labeled by the edges in the union of bipartite
graphs $Q_1, \dots, Q_m$, each of type $(\{(r/2-1, rs), (r/2+1, rs+1)\}, \{(r/2-1, rs), (r/2+1, rs+1)\})$.
The incoming flow is uniform over those L-vertices where $a_i \in B_i$ for $i = 1, \dots, u$, and if $e_\ell = \{i,j\}$
then the edge $\{a_i, a_j\}$ is present in $Q_\ell$ for $\ell = 1, \dots, m$, and both $a_i, a_j$ have degree $rs+1$ in $Q_\ell$.

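The types listed in the stages above must be internally consistent: counting edges from either side of a bipartite graph gives the same number, and the growth between consecutive types should match the number of edges each stage adds to one bipartite graph. The following sketch checks this for one hypothetical choice of parameters ($r = 20$ and $rs = 4$, picked only so that all quantities are integers); the helper names are illustrative.

```python
def edge_count(side):
    # number of edges seen from one side: sum of count * degree
    return sum(count * degree for count, degree in side)

def edges_of_type(bip_type):
    left, right = bip_type
    return edge_count(left), edge_count(right)

r, rs = 20, 4   # hypothetical values with r even and rs an integer

mixed = [(r - 1 - rs, rs), (rs, rs - 1)]
types = [
    (mixed, mixed),                                 # after Stage 0
    ([(r - rs, rs), (rs, rs - 1)], [(r - 1, rs)]),  # one endpoint class loaded
    ([(r, rs)], [(r, rs)]),                         # both endpoint classes loaded
    ([(r // 2, rs), (r // 2, rs + 1)],) * 2,        # after hiding
    ([(r // 2 - 1, rs), (r // 2 + 1, rs + 1)],) * 2 # after loading the edge
]

counts = [edges_of_type(t) for t in types]
totals = [left for left, right in counts]
increments = [b - a for a, b in zip(totals, totals[1:])]
```

With these values the increments are $rs$, $rs$, $r/2$ and $1$, matching the $rs$ edges attached to each newly loaded vertex, the $r/2$ hiding edges, and the single loaded edge.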
[Figure 1 panel labels: $A_i$: $r-1$ vertices; $A_j$: $r-1$ vertices.]

Figure 1: Stage 0: Edges added to $G_\ell$ when $e_\ell = \{i,j\}$ is an edge of $K$. The flow is uniform to
instances with $a_1, \dots, a_k \notin A_i$ for $i = 1, \dots, k-1$.



[Figure 2 panel labels: 1 new vertex; $B_t$: $A_t$ with 1 new vertex, giving $r$ vertices; $A_j$: $r-1$ vertices.]

Figure 2: Stage $t$ for $t = 1, \dots, u$: $rs$ added edges in some $G_\ell$ at stage $t$, when $e_\ell = \{t,j\}$ with
$t < j$. See Figure 3 for the case $e_\ell = \{i,t\}$ with $i < t$. (No edge is added to $G_\ell$ at stage $t$ when
$e_\ell = \{i,j\}$ with $t \neq i$ and $t \neq j$.) The added edges are between the new vertex of $A_t$ and the $rs$
vertices in $A_j$, respectively $B_i$, of degree $(rs-1)$. The flow is directed to instances where the new
vertex of $A_t$ is $a_t$.

[Figure 3 panel labels: 1 new vertex; $B_i$: $r$ vertices; $B_t$: $A_t$ with 1 new vertex, giving $r$ vertices.]

Figure 3: Stage $t$ for $t = 1, \dots, u$: $rs$ added edges in some $G_\ell$ at stage $t$, when $e_\ell = \{i,t\}$ with
$i < t$. See Figure 2 for the case $e_\ell = \{t,j\}$ with $t < j$. (No edge is added to $G_\ell$ at stage $t$ when
$e_\ell = \{i,j\}$ with $t \neq i$ and $t \neq j$.) The added edges are between the new vertex of $A_t$ and the $rs$
vertices in $A_j$, respectively $B_i$, of degree $(rs-1)$. The flow is directed to instances where the new
vertex of $A_t$ is $a_t$.




[Figure 4 panel labels: $B_i$: $r$ vertices, all of degree $rs$; $B_j$: $r$ vertices, all of degree $rs$.]

Figure 4: Stage $u+1$: We add $r/2$ vertex-disjoint edges to $G_\ell$ when $e_\ell = \{i,j\}$ is an edge of $K$.
The flow is directed to instances where the degrees of $a_i$ and $a_j$ remain $rs$ in $G_\ell$.

[Figure 5 panel labels: $B_i$: $r$ vertices; $B_j$: $r$ vertices.]

Figure 5: Stage $u+1+t$ for $t = 1, \dots, m$: Let $e_t = \{i,j\}$. Then a single edge is added to $Q_t$
between two vertices $b_i \in B_i$ and $b_j \in B_j$ of degree $rs$ in $Q_t$. The flow is directed to instances
where $b_i = a_i$ and $b_j = a_j$.



Complexity analysis of the stages. Note that for an input graph $y$ containing a copy of $H_{[1,u]}$,
the definition of flow depends only on the vertices $a_1, \ldots, a_u$ that span $H$. As for any two graphs
$y, y'$ containing $H$ there is a permutation $\tau$ mapping a copy of $H$ in $y$ to a copy of $H$ in $y'$, we see
that $S_n$ acts transitively on flows.

Furthermore, by construction of our learning graph, from a vertex $v$ with $p_y(v) > 0$, flow is
directed uniformly to $g$ out of $d$ many neighbors, where $g, d$ depend only on the stage, not on $y$ or $v$.
Additionally, by symmetry of the flow, hypothesis (2) of Lemma 7 is also satisfied. We will invoke
Lemma 5 to evaluate the cost of each stage. Hypothesis (1) is satisfied by Lemma 6, hypothesis (2)
by Lemma 7, and hypothesis (3) by construction of the learning graph.



Stage 0: The set of L-vertices at the beginning of this stage is simply the root, thus the
vertex ratio (and maximum vertex ratio) is one. The degree ratio can be upper bounded by
$((n-k)/(n-kr-k))^k = O(1)$, as we will choose $r = o(n)$ and $k$ is constant. The length of
this stage is $O(sr^2)$ and so its complexity is $O(sr^2)$.

Stage $t$ for $t = 1, \ldots, u$: An L-vertex in $V_t$ will be used by the flow if and only if $a_i \in B_i$ for
$i = 1, \ldots, t-1$ and $a_i \notin B_1, \ldots, B_t, A_{t+1}, \ldots, A_u$ for $i = t, \ldots, k$. For any vertex $v \in V_t$ the
probability over $\sigma \in S_n$ that $\sigma(v)$ satisfies the second event is constant, thus the vertex ratio
is dominated by the first event, which has probability $\Omega((r/n)^{t-1})$. Thus the maximum vertex
ratio is $O((n/r)^{t-1})$. The degree ratio is $n$. Since $O(sr)$ edges are added, the complexity is
$O(sr\sqrt{n}\,(n/r)^{(t-1)/2})$.



Stage $u+1$: As above, an L-vertex in $V_{u+1}$ will be used by the flow if and only if $a_i \in B_i$ for
$i = 1, \ldots, u$. For any vertex $v \in V_{u+1}$ the probability over $\sigma$ that this is satisfied by $\sigma(v)$ is
$\Omega((r/n)^u)$, therefore the maximum vertex ratio is $O((n/r)^u)$. For each $e_\ell = \{i,j\}$, half of the
vertices in $B_i$ and half of the vertices in $B_j$ will have degree $rs$ in $G_\ell$. Therefore, the degree
ratio is $4^m = O(1)$. Since $O(r)$ edges are added, the complexity of this stage is
$O(r(n/r)^{u/2})$.

Stage $u+t+1$ for $t = 1, \ldots, m$: In every stage, the degree ratio is $O(r^2)$. An L-vertex is in
the flow at the beginning of stage $u+t+1$ if the following two conditions are satisfied:

$$a_i \in B_i \text{ for } i = 1, \ldots, u, \tag{5}$$
$$\text{if } e_\ell = \{i,j\} \text{ then } \{a_i, a_j\} \in G_\ell \text{ with } a_i, a_j \text{ of degree } rs+1 \text{ in } G_\ell, \text{ for } \ell = 1, \ldots, t-1. \tag{6}$$

The probability over $\sigma$ that $\sigma(v)$ satisfies Equation (5) is $\Omega((r/n)^u)$. Among vertices in $[v]$
satisfying this condition, a further $\Omega(s^{t-1})$ fraction will satisfy Equation (6). This follows
from Lemma 8 below, together with the independence of the bipartite graphs $G_1, \ldots, G_m$.
Thus the maximum vertex ratio is $O((n/r)^u s^{-(t-1)})$. As only one edge is added at this stage,
we obtain a cost of $O(r(n/r)^{u/2} s^{-(t-1)/2})$.
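Each stage bound above has the shape length times the square root of (degree ratio times maximum vertex ratio). The following Python sketch assumes that shape and tabulates the exponent of $n$ for every stage when $r = n^{\rho}$ and $s = n^{-\tau}$. The function name `stage_exponents` and the parameters `rho`, `tau` are illustrative and do not appear in the paper; constant factors are ignored.

```python
from fractions import Fraction as F

def stage_exponents(u, m, rho, tau):
    """Exponent of n in each stage bound, with r = n**rho and s = n**(-tau).

    Assumes cost = length * sqrt(degree_ratio * max_vertex_ratio),
    which is the pattern of the stage-by-stage bounds in the text.
    """
    nr = 1 - rho  # exponent of n/r
    costs = {}
    # Stage 0: length s*r^2, both ratios O(1).
    costs["stage 0"] = -tau + 2 * rho
    # Stage t, 1 <= t <= u: length s*r, degree ratio n, vertex ratio (n/r)^(t-1).
    for t in range(1, u + 1):
        costs[f"stage {t}"] = -tau + rho + F(1, 2) * (1 + (t - 1) * nr)
    # Stage u+1: length r, degree ratio O(1), vertex ratio (n/r)^u.
    costs[f"stage {u + 1}"] = rho + F(1, 2) * u * nr
    # Stage u+t+1, 1 <= t <= m: length 1, degree ratio r^2,
    # vertex ratio (n/r)^u * s^-(t-1).
    for t in range(1, m + 1):
        costs[f"stage {u + t + 1}"] = F(1, 2) * (2 * rho + u * nr + (t - 1) * tau)
    return costs

if __name__ == "__main__":
    # Illustrative parameters for u = 3, m = 3: r = n^(3/4), s = n^(-3/16).
    for name, exponent in stage_exponents(3, 3, F(3, 4), F(3, 16)).items():
        print(name, exponent)
```

With these illustrative parameters the largest exponents occur at stage 0, at stage $u$, and at the final stage, which is the balancing that the parameter choice in Section 4.2 aims for.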

Lemma 8. Let $Y_1, Y_2$ be disjoint $r$-element subsets of $[n]$, and let $(y_1, y_2) \in Y_1 \times Y_2$. Let $K$ be a
bipartite graph between $Y_1$ and $Y_2$ of type $(\{(r/2-1, rs), (r/2+1, rs+1)\}, \{(r/2-1, rs), (r/2+1, rs+1)\})$.
The probability over $\sigma \in S_n$ that the edge $\{y_1, y_2\}$ is in $\sigma(K)$ and both $y_1$ and $y_2$ are
of degree $rs+1$, is at least $s/4$.

Proof. The degree condition is satisfied with probability at least $1/4$. Given that the degree condition
is satisfied, it is enough to show that for a bipartite graph $K'$ of type $(\{(r, rs)\}, \{(r, rs)\})$ the
probability over $\sigma \in S_n$ that $\sigma(K')$ contains the fixed edge $(y_1, y_2)$ is at least $s$, since $K$ is such a
graph plus some additional edges.

Because of symmetry, this probability does not depend on the choice of the edge; denote it
by $p$. Let $K_1, \ldots, K_c$ be an enumeration of all bipartite graphs isomorphic to $K'$. We will count
in two different ways the cardinality $\chi$ of the set $\{(e, h) : e \in K_h\}$. Every $K_h$ contains $sr^2$ edges,
therefore $\chi = csr^2$. On the other hand, every edge appears in $pc$ graphs, therefore $\chi = r^2 pc$, and
thus $p = s$. □
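The double-counting step can be checked by brute force on a tiny case. The sketch below is an illustration, not part of the proof: it enumerates every bipartite graph between two $r$-sets in which all vertices have degree $rs$ (all graphs of the type, rather than a single isomorphism class; the same counting identity applies because the family is closed under permutations) and measures how often a fixed edge appears. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations, product

def regular_bipartite_graphs(r, deg):
    """Yield all bipartite graphs between two r-sets with every degree equal to deg.

    A graph is a tuple whose i-th entry lists the right neighbours of left
    vertex i. Brute force, so only suitable for very small r.
    """
    neighbourhoods = list(combinations(range(r), deg))
    for choice in product(neighbourhoods, repeat=r):
        right_degrees = [0] * r
        for nbrs in choice:
            for j in nbrs:
                right_degrees[j] += 1
        if all(x == deg for x in right_degrees):
            yield choice

def edge_frequency(r, deg, edge):
    """Fraction of the enumerated graphs that contain the edge (i, j)."""
    i, j = edge
    total = hits = 0
    for graph in regular_bipartite_graphs(r, deg):
        total += 1
        hits += j in graph[i]
    return Fraction(hits, total)

if __name__ == "__main__":
    r, deg = 4, 2  # corresponds to s = deg / r = 1/2
    print(edge_frequency(r, deg, (0, 0)))
```

For $r = 4$ and degree $2$ the frequency equals $s = 1/2$ for every edge, as the counting argument predicts.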

4.2 Loading H 

When u = k, the constructed learning graph determines if H is a subgraph of the input graph, 
since a copy of H is loaded on positive instances. Choosing the parameters s, r to optimize the 
total cost gives the following theorem. 

Theorem 9. Let $H$ be a graph on $k \geq 3$ vertices and $m \geq 1$ edges. Then there is a
quantum query algorithm for determining if $H$ is a subgraph of an $n$-vertex graph making
$O(n^{2-2/(k+1)-k/((k+1)(m+1))})$ many queries.



Proof. By Theorem 1, it suffices to show that the learning graph $\mathcal{G}_1$ has the claimed complexity.
We will use Lemma 2 and upper bound the learning graph complexity by the sum of the costs of
the stages. As usual, we will ignore factors of $k$.
The complexity of stage 0 is:

$$S' = O(sr^2).$$

The complexity of each stage $1, \ldots, k$, and also their sum, is dominated by the complexity of stage
$k$:

$$U' = O\left(sr\sqrt{n}\,(n/r)^{(k-1)/2}\right).$$

The complexity of stage $k+1$ is:

$$U'' = O\left(r(n/r)^{k/2}\right).$$

Again, the complexity of each stage $k+2, \ldots, k+m+1$, and also their sum, is dominated by the
complexity of stage $k+m+1$:

$$U''' = O\left(r(n/r)^{k/2} s^{-(m-1)/2}\right).$$

Observe that $U'' = O(U''')$.

Therefore the overall cost can be bounded by $S' + U' + U'''$. Choosing $r = n^{1-1/(k+1)}$ makes $S' = U'$
for any value of $s$, as their dependence on $s$ is the same. When $s = 1$ we have $U''' < S' = U'$, thus
we can choose $s < 1$ to balance all three terms. Letting $s = n^{-t}$ we have $S' = U' = O(n^{2-2/(k+1)-t})$
and $U''' = O(n^{1+(k-2)/(2(k+1))+t(m-1)/2})$. Making these equal gives $t = k/((k+1)(m+1))$ and
gives overall cost $O(n^{2-2/(k+1)-t})$. □
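The balancing in this proof is exponent arithmetic, so it can be verified exactly with rational numbers. The sketch below (the function name `theorem9_exponents` is illustrative) plugs $r = n^{1-1/(k+1)}$ and $s = n^{-t}$ with $t = k/((k+1)(m+1))$ into the three bounds and returns the resulting exponents of $n$.

```python
from fractions import Fraction as F

def theorem9_exponents(k, m):
    """Exponents of n in S', U', U''' for r = n**(1 - 1/(k+1)), s = n**(-t)."""
    rho = 1 - F(1, k + 1)              # r = n**rho
    t = F(k, (k + 1) * (m + 1))        # s = n**(-t)
    # S' = s r^2
    s_prime = -t + 2 * rho
    # U' = s r sqrt(n) (n/r)^((k-1)/2)
    u_prime = -t + rho + F(1, 2) + F(k - 1, 2) * (1 - rho)
    # U''' = r (n/r)^(k/2) s^(-(m-1)/2)
    u_triple = rho + F(k, 2) * (1 - rho) + t * F(m - 1, 2)
    return s_prime, u_prime, u_triple

if __name__ == "__main__":
    # Example: triangle, k = 3 and m = 3.
    print(theorem9_exponents(3, 3))
```

All three exponents coincide and equal $2 - 2/(k+1) - k/((k+1)(m+1))$, matching the statement of Theorem 9.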



4.3 Loading the full graph but one vertex 

Recall that $H$ is a graph on vertex set $\{1, 2, \ldots, k\}$, with $k \geq 3$ vertices, $m \geq 1$ edges and minimum
degree $d \geq 1$. By renaming the vertices, if necessary, we assume that vertex $k$ has degree $d$.

Our second algorithm employs the learning graph $\mathcal{G}_1$ of Section 4.1 with $u = k-1$ to first load
$H_{[1,k-1]}$. This is then combined with search to find the missing vertex and a collision subroutine
to verify it links with $H_{[1,k-1]}$ to form $H$.

Again, let $H_{[1,k-1]}$ be the subgraph of $H$ induced by vertices $1, 2, \ldots, k-1$, and let $e_1, \ldots, e_{m'}$
be the edges of $H_{[1,k-1]}$, enumerated in some fixed order. Thus note that $m = m' + d$. For any
positive input graph $y$, we fix $k$ vertices $a_1, a_2, \ldots, a_k$ such that $\{a_i, a_j\}$ is an edge of $y$ whenever
$\{i,j\}$ is an edge of $H$. For notational convenience we assume that $a_k$ is of degree $d$ and connected
to $a_1, \ldots, a_d$.



Learning graph $\mathcal{G}_2$:

Stages $0, 1, \ldots, k+m'$: Learning graph $\mathcal{G}_1$ of Section 4.1.

Stage $k+m'+1$: We use search plus a $d$-wise collision subroutine to find a vertex $v$ and $d$ edges
which link $v$ to $H_{[1,k-1]}$ to form $H$. The learning graph for this subroutine is given in Section 4.4.



Complexity analysis of the stages. All stages but the last one have been analyzed in Section 4.1,
therefore only the last stage remains to be studied.

Stage $k+m'+1$: Let $V_{k+m'+1}$ be the set of L-vertices at the beginning of stage $k+m'+1$.
We will evaluate the complexity of this stage in a similar fashion as we have done previously.
As $S_n$ acts transitively on the flows, by Lemma 6 we can invoke Lemma 3 and it suffices
to consider the maximum of $C(E_{[u]})$ over equivalence classes $[u]$. Furthermore, as we have
argued in Section 4.1, the learning graph also satisfies the conditions of Lemma 7, thus we can
apply Lemma 4 to evaluate $C(E_{[u]})$. The maximum vertex ratio over $[u]$ is $O(s^{-m'}(n/r)^{k-1})$.

As shown in Section 4.4, the complexity of the subroutine learning graph attached to each
$v \in V_{k+m'+1}$ is at most $O(\sqrt{n}\,r^{d/(d+1)})$. Thus by Lemma 4 the complexity of this stage is

$$O\left(s^{-m'/2}\left(\frac{n}{r}\right)^{(k-1)/2}\sqrt{n}\,r^{d/(d+1)}\right).$$

Choosing the parameters $s, r$ to optimize the total cost gives the following theorem.

Theorem 10. Let $H$ be a graph on $k \geq 3$ vertices with minimal degree $d \geq 1$ and $m$ edges. Then
there is a quantum query algorithm for determining if $H$ is a subgraph of an $n$-vertex graph making
$O(n^{2-2/k-(2k-d-3)/(k(d+1)(m-d+2))})$ many queries.



Proof. By Theorem 1, it suffices to show that the learning graph $\mathcal{G}_2$ has the claimed complexity.
We will use Lemma 2 and upper bound the learning graph complexity by the sum of the costs of
the stages. As usual, we will ignore factors of $k$.
The complexity of stage 0 is:

$$S' = O(sr^2).$$

The complexity of each stage $1, \ldots, k-1$, and also their sum, is dominated by the complexity of
stage $k-1$:

$$U' = O\left(sr\sqrt{n}\,(n/r)^{(k-2)/2}\right).$$

The complexity of stage $k$ is:

$$U'' = O\left(r(n/r)^{(k-1)/2}\right).$$

Again, the complexity of each stage $k+1, \ldots, k+m'$, and also their sum, is dominated by the
complexity of stage $k+m'$:

$$U''' = O\left(r(n/r)^{(k-1)/2} s^{-(m'-1)/2}\right).$$
Observe that $U'' = O(U''')$. Finally, denote the cost of stage $k+m'+1$ by

$$C = O\left(s^{-m'/2}\left(\frac{n}{r}\right)^{(k-1)/2}\sqrt{n}\,r^{d/(d+1)}\right).$$

Observe that $U''' = O(C)$, provided that $r^{1/(d+1)} s^{1/2} = O(n^{1/2})$. The latter is always satisfied
since $s \leq 1$, $r \leq n$ and $d \geq 1$. Therefore the overall cost can be bounded by $S' + U' + C$.
Choosing $r = n^{1-1/k}$ makes $S' = U'$ for any value of $s$, as their $s$ dependence is the same. When
$s = 1$ we have $C < S' = U'$, thus we can choose $s < 1$ to balance all three terms. Letting $s = n^{-t}$ we
have $S' = U' = O(n^{2-2/k-t})$ and $C = O(n^{2-2/k+1/(2k)-(k-1)/(k(d+1))+tm'/2})$. Making these equal
gives $t = (2k-d-3)/(k(d+1)(m'+2))$. Since $k \geq 3$ we have $t > 0$ and thus $s < 1$. The overall cost
of the algorithm is $O(n^{2-2/k-t})$. Noting that $m = m' + d$ gives the statement of the theorem. □
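The value of $t$ comes from a single linear equation in the exponents. As a sanity check, the sketch below solves that equation with exact rationals instead of substituting the closed form; the function name `theorem10_balance` and its parameter `m_prime` (standing for $m'$) are illustrative.

```python
from fractions import Fraction as F

def theorem10_balance(k, d, m_prime):
    """Solve S' = C for t, with r = n**(1 - 1/k) and s = n**(-t).

    Exponents of n, ignoring constants:
      S' = s r^2                                           ->  a - t
      C  = s^(-m'/2) (n/r)^((k-1)/2) sqrt(n) r^(d/(d+1))   ->  b + t * m'/2
    Returns (t, common exponent of S' and C).
    """
    rho = 1 - F(1, k)
    a = 2 * rho
    b = F(k - 1, 2) * (1 - rho) + F(1, 2) + F(d, d + 1) * rho
    t = (a - b) / (1 + F(m_prime, 2))
    return t, a - t

if __name__ == "__main__":
    # Triangle: k = 3, d = 2, so m' = m - d = 1.
    print(theorem10_balance(3, 2, 1))
```

For the triangle this gives $t = 1/27$ and exponent $35/27$, and in general the solved $t$ agrees with $(2k-d-3)/(k(d+1)(m'+2))$.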



Our main result is an immediate consequence of Theorem 9 and Theorem 10.

Theorem 11. Let $H$ be a graph on $k \geq 3$ vertices with minimal degree $d \geq 1$ and $m$ edges. Then
there is a quantum query algorithm for determining if $H$ is a subgraph of an $n$-vertex graph making
$O(n^{2-2/k-t})$ many queries, where

$$t = \max\left\{\frac{k^2-2(m+1)}{k(k+1)(m+1)},\ \frac{2k-d-3}{k(d+1)(m-d+2)}\right\}.$$
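To see which of the two terms wins for a given $H$, the formula of Theorem 11 can simply be evaluated. The following sketch does this with exact fractions; the function name `theorem11_exponent` and the chosen example graphs are illustrative.

```python
from fractions import Fraction as F

def theorem11_exponent(k, m, d):
    """Return (t_first, t_second, exponent) where the query bound is n**exponent."""
    t_first = F(k * k - 2 * (m + 1), k * (k + 1) * (m + 1))   # from Theorem 9
    t_second = F(2 * k - d - 3, k * (d + 1) * (m - d + 2))    # from Theorem 10
    t = max(t_first, t_second)
    return t_first, t_second, 2 - F(2, k) - t

if __name__ == "__main__":
    examples = {
        "triangle": (3, 3, 2),
        "K4": (4, 6, 3),
        "K3,3": (6, 9, 3),
    }
    for name, (k, m, d) in examples.items():
        t1, t2, e = theorem11_exponent(k, m, d)
        print(f"{name}: t1={t1}, t2={t2}, exponent={e} (~{float(e):.4f})")
```

For the triangle the second term dominates and the exponent is $35/27$; for $K_{3,3}$ the first term is the larger one.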
4.4 Graph collision subroutine

In this section we describe a learning graph for the graph collision subroutine that is used in the
learning graph given in Section 4.3. For each vertex $v$ at the end of stage $k+m'$ we will attach a
learning graph $\mathcal{G}_v$. The root of $\mathcal{G}_v$ will be the label of $v$ and we will show that it has complexity
$\sqrt{n}\,r^{d/(d+1)}$. Furthermore, for every flow $p_y$ on $\mathcal{G}_v$, the sinks of flow will be L-vertices that have
loaded a copy of $H$. We now describe $\mathcal{G}_v$ in further detail.

A vertex $v$ at the end of stage $k+m'$ is labeled by a $(k-1)$-partite graph $G$ on color classes
$B_1, \ldots, B_{k-1}$ of size $r$. The edges of $G$ are the union of the edges in bipartite graphs $G_1, \ldots, G_{m'}$,
each of type $(\{(r/2-1, rs), (r/2+1, rs+1)\}, \{(r/2-1, rs), (r/2+1, rs+1)\})$. This will be the
label of the root of $\mathcal{G}_v$.

On $\mathcal{G}_v$ we define a flow for every input $y$ such that $p_y(v) > 0$ in the learning graph loading
$H_{[1,k-1]}$. Say that $y$ contains a copy of $H$ and that vertices $a_1, \ldots, a_k$ span $H$ in $y$. For ease of
notation, assume that vertex $a_k$ (the degree $d$ vertex removed from $H$) is connected to $a_1, \ldots, a_d$.
Recall that the L-vertex $v$ will have flow if and only if $a_i \in B_i$ for $i = 1, \ldots, k-1$ and
if $e_\ell = \{i,j\}$ then the edge $\{a_i, a_j\}$ is present in $G_\ell$ for $\ell = 1, \ldots, m'$, and both $a_i, a_j$ have degree
$rs+1$ in $G_\ell$. Thus for each such $y$ we will define a flow on $\mathcal{G}_v$. The flow will only depend on
$a_1, \ldots, a_k$. The complexity of $\mathcal{G}_v$ will depend on a parameter $1 \leq \lambda \leq r$ that we will optimize later.

Stage 0: Choose a vertex $u \notin B_i$ for $i = 1, \ldots, k-1$ and load $\lambda$ edges between $u$ and vertices
of degree $rs+1$ in $B_i$, for each $i = 1, \ldots, d$. Flow is directed uniformly along those L-edges where
$u = a_k$ and none of the edges loaded touch any of the $a_1, \ldots, a_d$.

Stage $t$ for $t = 1, \ldots, d$: Load an additional edge between $u$ and $B_t$. The flow is directed
uniformly along those L-edges where the edge loaded is $\{a_k, a_t\}$.

Complexity analysis of the stages

Stage 0: We use Lemma 4. As the vertices at the beginning of this stage consist only of the
root, conditions (1) and (2) are trivially satisfied; condition (3) is satisfied by construction.
Flow is present in all L-edges of this stage where $u = a_k$, which is an $\Omega(1/n)$ fraction of the
total number of L-edges. Thus the degree ratio $d/g = O(n)$. The length of the stage is $\lambda$,
giving a total cost of $\lambda\sqrt{n}$.



Stage $t$ for $t = 1, \ldots, d$: Let $V_t$ be the set of vertices at the beginning of stage $t$. The definition
of flow depends only on $a_1, \ldots, a_k$, thus $S_n$ acts transitively on the flows. Applying Lemma 6
gives that $\{p_y\}$ is consistent with $[u]^+$ for $u \in V_t$. Also by construction the hypothesis of
Lemma 7 is satisfied, thus we are in position to use Lemma 5.

The length of each stage is 1. The out-degree of an L-vertex in stage $t$ is $O(r)$ while the flow
uses just one outgoing edge, thus the degree ratio $d/g = O(r)$. Finally, we must estimate the
fraction of vertices in $[u]$ with flow for $u \in V_t$. A vertex $u$ in $V_t$ has flow if and only if $a_k$
was loaded in stage 0 and the edges $\{a_k, a_i\}$ are loaded for $i = 1, \ldots, t-1$. The probability
over $\sigma \in S_n$ that the first event holds in $\sigma(u)$ is $\Omega(1/n)$. Given that $a_k$ has been loaded at
vertex $u \in V_t$, the probability over $\sigma$ that $\{a_k, a_i\} \in \sigma(u)$ is $\Omega(\lambda/r)$. Thus we obtain that the
maximum vertex ratio at stage $t$ is $n(r/\lambda)^{t-1}$. The complexity of stage $t$ is maximized when
$t = d$, giving an overall complexity $\sqrt{nr}\,(r/\lambda)^{(d-1)/2}$.

The sum of the costs $\lambda\sqrt{n}$ and $\sqrt{nr}\,(r/\lambda)^{(d-1)/2}$ is minimized for $\lambda = r^{d/(d+1)}$, giving a cost of
$O(\sqrt{n}\,r^{d/(d+1)})$.

4.5 Comparison with the quantum walk approach 

It is insightful to compare the cost of the learning graph algorithm for finding a subgraph with
the algorithm of [MSS07] using a quantum walk on the Johnson graph. We saw in the analysis
of the learning graph that there were three important terms in the cost, denoted $S', U', C$. In
the quantum walk formalism there are also three types of costs: setup, aggregated update, and
aggregated checking, which we will denote by $S, U, C$. When the walk is done on the Johnson
graph with vertices labeled by $r$-element subsets these costs are

$$S = r^2$$

Here $d$ is the minimal degree of a vertex in $H$.

Here there is only one parameter, and in general $r$ cannot be chosen to make all three terms
equal. In the case of triangle finding ($k = 3$, $d = 2$), the choice $r = n^{3/5}$ is made. This makes
$S = n^{1.2}$ and $U = C = n^{1.3}$. In the general case of finding $H$, the choice $r = n^{1-1/k}$ is made, giving
the first and second terms equal to $n^{2-2/k}$ and the third term $C = n^{2-\frac{1}{k}\left(1+\frac{k}{d+1}+\frac{d-1}{2(d+1)}\right)}$.
Thus $C < S = U$ even for the largest possible value $d = k-1$. Because of this, the analysis gives
$n^{2-2/k}$ queries for any graph on $k$ vertices, independent of $d$.
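The exponents quoted in this comparison can be reproduced numerically. The sketch below assumes aggregated update and checking costs of the form $U = (n/r)^{(k-1)/2} r^{3/2}$ and $C = (n/r)^{(k-1)/2}\sqrt{n}\,r^{d/(d+1)}$; these forms are an assumption chosen to match the exponents stated in the text, and the function name `walk_exponents` is illustrative.

```python
from fractions import Fraction as F

def walk_exponents(k, d, rho):
    """Exponents of n for setup, aggregated update and aggregated checking, r = n**rho.

    ASSUMED cost forms (not taken verbatim from the text):
      S = r^2
      U = (n/r)^((k-1)/2) * r^(3/2)
      C = (n/r)^((k-1)/2) * sqrt(n) * r^(d/(d+1))
    """
    setup = 2 * rho
    update = F(k - 1, 2) * (1 - rho) + F(3, 2) * rho
    checking = F(k - 1, 2) * (1 - rho) + F(1, 2) + F(d, d + 1) * rho
    return setup, update, checking

if __name__ == "__main__":
    # Triangle finding with r = n^(3/5).
    print(walk_exponents(3, 2, F(3, 5)))
    # General case with r = n^(1 - 1/k), here for K4.
    print(walk_exponents(4, 3, 1 - F(1, 4)))
```

Under these assumed forms the triangle case gives exponents $6/5$, $13/10$, $13/10$, and for $r = n^{1-1/k}$ the checking exponent stays strictly below $2 - 2/k$ for every $d \leq k-1$.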